chore: add schema for UC Metric Views on Analytics plugin #429
…hema) Author the metric.json config contract as Zod in packages/shared/src/schemas/metric-source.ts (single source of truth) and generate the published JSON Schema via tools/generate-json-schema.ts into docs/static/schemas/metric-source.schema.json. metric.json declares the Unity Catalog Metric View sources (sp/obo lanes) that opt an app into the analytics metric-view path. Reconciles PR1 of the metric-views stack onto main's Zod-first schema convention; #341 authored this JSON-first with a separate generated type. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
Validate metricSourceSchema directly via safeParse (no Ajv): accepts sp-only / mixed sp+obo / empty configs; rejects bare-string entries, missing source, unknown entry and top-level fields, invalid metric keys (leading digit, hyphen), and malformed source FQNs. Ports the #341 case set to main's Zod-first validation idiom. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
Trim the module header, move the object-entry rationale to a @note on metricEntrySchema, and drop the section banners. Comment-only — the Zod schema, describe() strings, and generated JSON are unchanged. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
Pull request overview
Adds the initial schema contract for config/queries/metric.json (authored in Zod and emitted as JSON Schema) to support the upcoming analytics metric-view stack.
Changes:
- Introduces a new Zod schema (metricSourceSchema) that defines the allowed shape for metric.json (closed top-level object with optional $schema, sp, obo).
- Adds Vitest coverage for accepted/rejected configurations (metric key pattern, strict objects, 3-part UC FQN).
- Extends the JSON Schema generation script to emit and publish metric-source.schema.json under docs/static/schemas.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/generate-json-schema.ts | Emits and writes the generated JSON Schema for the new metric-source Zod schema. |
| packages/shared/src/schemas/metric-source.ts | Defines the Zod source-of-truth schema and inferred TS types for metric.json. |
| packages/shared/src/schemas/metric-source.test.ts | Adds schema validation tests (valid/invalid configs). |
| docs/static/schemas/metric-source.schema.json | Generated draft-07 JSON Schema artifact served by docs for editor $schema validation. |
The SP-only case carried an explicit obo: {} (ported verbatim from the
reference); empty-lane coverage already lives in the empty-configuration
case. Review feedback on #429.
Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
Maybe this was already discussed and I missed it, but I don't fully agree with having the sp/obo as the top key. Basically, currently, it would be:
{
"sp": { "revenue": { "source": "main.analytics.revenue" } },
"obo": { "my_orders": { "source": "main.sales.orders_by_user" } }
}
I'd prefer:
{
"metrics": {
"revenue": { "source": "main.analytics.revenue", "execute_as": "app_service_principal" },
"my_orders": { "source": "main.sales.orders_by_user", "execute_as": "user" }
}
}
The reasons why I believe this would be more intuitive are:
- The query execution principal is not that important of a concept to have it as the first-level key. In fact, maybe many users don't even care (and we can just use a default for them?).
- Having it as a key means that it's hard to even know what it is. IMO keys in an object should represent entities of the parent object.
- Naming of sp/obo is quite hard to understand. I'd prefer full text (at least accept it), i.e., service_principal and on_behalf_of, or ideally app_service_principal and user to be more specific.
…xecutor
Replace the sp/obo lane sections with a single 'metrics' map; the
execution principal moves into each entry as 'executor'
("service_principal" | "user"), defaulting to service_principal —
consistent with plain .sql queries executing as SP.
Entity-first also makes metric keys unique by construction: the same key
can no longer be declared in two lanes, so the cross-lane duplicate rule
(previously unexpressible in the schema and enforced post-parse) becomes
unrepresentable.
Review feedback from calvarjorge on #429.
Co-authored-by: Isaac
Signed-off-by: Atila Fassina <atila@fassina.eu>
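The cross-lane duplicate rule mentioned in this commit message can be illustrated with a small sketch. It is not code from the PR: the type name `TwoLaneConfig`, the helper `crossLaneDuplicates`, and the second source name are hypothetical, chosen only to show why the two-lane shape needed a post-parse check that the single-map shape does not.

```typescript
// Hypothetical model of the legacy two-lane shape: each lane is its own map,
// so nothing stops the same metric key from appearing in both.
interface TwoLaneConfig {
  sp?: Record<string, { source: string }>;
  obo?: Record<string, { source: string }>;
}

// The kind of post-parse check the lane shape required: list the keys that
// were declared in both lanes.
function crossLaneDuplicates(config: TwoLaneConfig): string[] {
  const obo = config.obo ?? {};
  return Object.keys(config.sp ?? {}).filter((key) =>
    Object.prototype.hasOwnProperty.call(obo, key),
  );
}

// Illustrative input only; "main.sales.revenue_by_user" is a made-up source.
const legacy: TwoLaneConfig = {
  sp: { revenue: { source: "main.analytics.revenue" } },
  obo: { revenue: { source: "main.sales.revenue_by_user" } },
};

const duplicates = crossLaneDuplicates(legacy);
```

With one map keyed by metric name there is no second lane to compare against, so a check like this has nothing left to do.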
@calvarjorge agreed and adopted in 917242a (value renamed to app_service_principal in 5fc4a00). The schema is now entity-first:
{
"metrics": {
"revenue": { "source": "main.analytics.revenue" },
"my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
}
}
Bonus your shape bought us: with a single map, metric keys are unique by construction; the old two-lane shape could express the same key in both…
…_principal More specific about whose service principal executes the query — the app's. Bare service_principal is now a rejected value. Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
calvarjorge left a comment:
Overall looks good - only one last thing: the concept of metric is quite broad. Maybe we want to be more specific and refer to this as metric_view? Especially if we might have other concepts similar to metrics in the future (telemetry metrics, etc.).
The entries are UC metric views, not generic metrics — the key now says so. camelCase per the repo's authored-config-key convention (manifest keys like displayName/dependsOn; snake_case is reserved for values and fields mirroring Databricks APIs). Co-authored-by: Isaac Signed-off-by: Atila Fassina <atila@fassina.eu>
Follow-up on the shape discussion: the root key is renamed from metrics to metricViews.
{
"metricViews": {
"revenue": { "source": "main.analytics.revenue" },
"my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
}
}
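To make the shape above concrete, here is a minimal sketch of how the optional `executor` could be normalized after parsing. The names `Executor`, `MetricViewEntryInput`, `MetricViewEntry`, and `withDefaultExecutor` are hypothetical; in the PR the types are inferred from the Zod schema and the default is applied by the schema itself, so this only mirrors the described behavior.

```typescript
// Hypothetical type names; the PR infers its real types from the Zod schema.
type Executor = "app_service_principal" | "user";

interface MetricViewEntryInput {
  source: string;
  executor?: Executor; // optional in the authored metric.json
}

interface MetricViewEntry {
  source: string;
  executor: Executor; // always present once the default is applied
}

const DEFAULT_EXECUTOR: Executor = "app_service_principal";

// Fill in the default executor for each entry, leaving explicit values alone.
function withDefaultExecutor(
  metricViews: Record<string, MetricViewEntryInput>,
): Record<string, MetricViewEntry> {
  const out: Record<string, MetricViewEntry> = {};
  for (const key of Object.keys(metricViews)) {
    const entry = metricViews[key];
    if (entry === undefined) continue; // guard for strict index checks
    out[key] = {
      source: entry.source,
      executor: entry.executor ?? DEFAULT_EXECUTOR,
    };
  }
  return out;
}

const normalized = withDefaultExecutor({
  revenue: { source: "main.analytics.revenue" },
  my_orders: { source: "main.sales.orders_by_user", executor: "user" },
});
```

Here `revenue` picks up the default executor while `my_orders` keeps its explicit `"user"` value.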
What
PR1 of the metric-views delivery stack: the schema contract for metric.json, the opt-in config that activates the analytics metric-view path. Ships in the shared package.
This is the typed contract only: no runtime, CLI, hook, UI, or demo. It lands inert; nothing executes until an app actually ships a config/queries/metric.json.
Why now / why it's safe
This re-ships part of #341 (a ~15k-line mega-PR) as a small, reviewable increment. The feature is opt-in and dormant (every metric-view code path gates on metric.json existing), and these are all new files, so it lands on main without touching any existing behavior.
What's in it: 4 files, +267
- packages/shared/src/schemas/metric-source.ts: metricSourceSchema + inferred MetricSource / MetricExecutor types
- tools/generate-json-schema.ts
- docs/static/schemas/metric-source.schema.json ($schema autocomplete + validation)
- packages/shared/src/schemas/metric-source.test.ts: safeParse validation cases
The metric.json contract
{
  "$schema": "https://databricks.github.io/appkit/schemas/metric-source.schema.json",
  "metricViews": {
    "revenue": { "source": "main.analytics.revenue" },
    "my_orders": { "source": "main.sales.orders_by_user", "executor": "user" }
  }
}
- Top level: { $schema?, metricViews? }, closed (rejects unknown keys).
- metricViews is a single map of metric key → entry. One map (rather than per-executor sections) makes metric keys unique by construction, so the route key space can't collide.
- Metric key: ^[a-zA-Z_][a-zA-Z0-9_]*$. It becomes the route key (POST /api/analytics/metric/:key), the useMetricView('<key>', …) argument, and the MetricRegistry augmentation key.
- source: a three-part UC name, <catalog>.<schema>.<metric_view>.
- executor: "app_service_principal" (default) runs as the app service principal with a shared cache; "user" runs on-behalf-of the requesting user with a per-user cache. Defaulting to app_service_principal matches plain <key>.sql queries executing as SP.
- Entries are objects so that future options (cacheTtl, defaultFilter, allowlists) are additive; executor is the first such option.
Reconciliation note (for reviewers comparing against #341)
#341 authored this JSON-Schema-first (hand-written .schema.json → generated .ts via a generate-schema-types.ts tool). Since #341's base, main adopted a Zod-first convention in the manifest refactor (#261): Zod is the single source of truth and the JSON Schema is generated from it via tools/generate-json-schema.ts. This PR re-expresses the contract in main's current idiom. Consequences vs. the #341 reference:
- No *.generated.ts: the inferred MetricSource type replaces it.
- No ajv / ajv-formats: validation rides the Zod schema directly via safeParse, matching validate-manifest.ts.
- No hand-written .schema.json: the generated JSON lives only in docs/static/schemas/, exactly like the manifest schemas.
Note the shape also differs from #341 (see revision note above): downstream slices (runtime, typegen, CLI) port their config-reading layer against this revised contract.
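The contract rules described above (closed top level, metric key pattern, three-part source, allowed executor values) can be approximated without any library. The sketch below is not the PR's Zod schema: `checkMetricConfig` and `CheckResult` are hypothetical names, and the `SOURCE_FQN` pattern is an assumption, since the description only states that the source has three dot-separated parts.

```typescript
// Dependency-free sketch of the documented rules; NOT the PR's Zod schema.
const METRIC_KEY = /^[a-zA-Z_][a-zA-Z0-9_]*$/; // pattern quoted in the description
// Assumption: each of the three name parts is a non-empty run of word characters.
const SOURCE_FQN = /^\w+\.\w+\.\w+$/;
const EXECUTORS = ["app_service_principal", "user"];

type CheckResult = { success: true } | { success: false; errors: string[] };

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function checkMetricConfig(input: unknown): CheckResult {
  if (!isPlainObject(input)) {
    return { success: false, errors: ["config must be an object"] };
  }
  const errors: string[] = [];
  // Closed top level: only $schema and metricViews are allowed.
  for (const k of Object.keys(input)) {
    if (k !== "$schema" && k !== "metricViews") errors.push(`unknown top-level key: ${k}`);
  }
  const views = input["metricViews"];
  if (views !== undefined && !isPlainObject(views)) {
    errors.push("metricViews must be an object");
  } else if (isPlainObject(views)) {
    for (const key of Object.keys(views)) {
      if (!METRIC_KEY.test(key)) errors.push(`invalid metric key: ${key}`);
      const entry = views[key];
      if (!isPlainObject(entry)) {
        errors.push(`${key}: entry must be an object`); // bare strings are rejected
        continue;
      }
      // Closed entry: only source and executor are allowed.
      for (const f of Object.keys(entry)) {
        if (f !== "source" && f !== "executor") errors.push(`${key}: unknown field ${f}`);
      }
      const source = entry["source"];
      if (typeof source !== "string" || !SOURCE_FQN.test(source)) {
        errors.push(`${key}: source must look like <catalog>.<schema>.<metric_view>`);
      }
      const executor = entry["executor"];
      if (
        executor !== undefined &&
        (typeof executor !== "string" || EXECUTORS.indexOf(executor) === -1)
      ) {
        errors.push(`${key}: executor must be one of ${EXECUTORS.join(", ")}`);
      }
    }
  }
  return errors.length === 0 ? { success: true } : { success: false, errors };
}
```

The result type loosely imitates the success/failure split of `safeParse`; unlike the real schema, this sketch only checks and does not apply the `executor` default.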
Testing
- pnpm build && pnpm docs:build ✅
- pnpm check:fix && pnpm -r typecheck ✅
- pnpm test ✅ (metric-source.ts at 100% coverage)
The generator is deterministic (re-running produces no diff) and does not drift the existing manifest/template schemas. Tests include rejection of the legacy sp/obo shape and verification that the executor default materializes on parse.
Stack context
Part of re-shipping #341 as a stacked chain (merge order PR0 → PR1 → PR3 → PR4 → PR2 → PR5 → PR6):
- x-forwarded-user in core OBO path
- metric.json schema
Only hard dependency on this PR: PR4 (appkit metric sync CLI) imports this schema and will validate via the Standard Schema interface (no Ajv).
This pull request and its description were written by Isaac.
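As a rough illustration of what validating "via the Standard Schema interface" could look like for a consumer such as the CLI, the sketch below re-declares a minimal subset of the Standard Schema v1 shape locally and validates against it. It assumes the exported schema exposes the `~standard` property; `StandardSchemaLike`, `validateSync`, and `toySchema` are hypothetical names, and `toySchema` is only a stand-in for the real schema.

```typescript
// Minimal local subset of the Standard Schema v1 shape (no dependencies).
interface Issue {
  readonly message: string;
}
type Result<T> =
  | { readonly value: T; readonly issues?: undefined }
  | { readonly issues: ReadonlyArray<Issue> };
interface StandardSchemaLike<T> {
  readonly "~standard": {
    readonly version: 1;
    readonly vendor: string;
    readonly validate: (value: unknown) => Result<T> | Promise<Result<T>>;
  };
}

// Validate input against any schema exposing "~standard"; this sketch only
// supports schemas whose validate() answers synchronously.
function validateSync<T>(schema: StandardSchemaLike<T>, input: unknown): T {
  const result = schema["~standard"].validate(input);
  if (result instanceof Promise) {
    throw new TypeError("expected synchronous validation");
  }
  if (result.issues) {
    throw new Error(result.issues.map((issue) => issue.message).join("; "));
  }
  return result.value;
}

// Hypothetical stand-in for the real exported schema, for illustration only.
const toySchema: StandardSchemaLike<{ metricViews?: object }> = {
  "~standard": {
    version: 1,
    vendor: "sketch",
    validate: (value) =>
      typeof value === "object" && value !== null
        ? { value: value as { metricViews?: object } }
        : { issues: [{ message: "config must be an object" }] },
  },
};

const parsed = validateSync(toySchema, { metricViews: {} });
```

Because the consumer depends only on the `~standard` property, it does not need Ajv or any validator-specific API.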